Transposition Invariant Pattern Matching for Multi-Track Strings

Authors

  • Kjell Lemström
  • Jorma Tarhio
Abstract

We consider the problem of multi-track string matching: finding the occurrences of a pattern across parallel strings. Given an alphabet $\Sigma$ of natural numbers and a set $S$ of $h$ strings over $\Sigma$, $s^i = s^i_1 \cdots s^i_n$ for $i = 1, \ldots, h$, a pattern $p = p_1 \cdots p_m$ has such an occurrence at position $j$ of $S$ if $p_1 = s^{i_1}_j,\ p_2 = s^{i_2}_{j+1},\ \ldots,\ p_m = s^{i_m}_{j+m-1}$ holds for some $i_1, \ldots, i_m \in \{1, \ldots, h\}$. An application of the problem is music retrieval, where occurrences of a monophonic query pattern are searched for in a polyphonic music database. In music retrieval it is even more pertinent to allow invariance under pitch-level transpositions, i.e., the task is to find whether there are occurrences of $p$ in $S$ such that the formulation above becomes $p_1 = s^{i_1}_j + c,\ p_2 = s^{i_2}_{j+1} + c,\ \ldots,\ p_m = s^{i_m}_{j+m-1} + c$ for some constant $c$. We present several algorithms solving the problem. Our main contribution, the MP algorithm, is a transposition-invariant bit-parallel filtering algorithm for static databases. After $O(nhe)$ time preprocessing, it finds candidates for transposition-invariant occurrences in time $O(n\lceil m/w \rceil + m + d)$, where $w$ denotes the size of the machine word in bits, and $e$ and $d$ are two factors dependent on the size of the alphabet. A straightforward algorithm is then used to check whether the candidates are proper occurrences; it needs time $O(hm)$ per candidate.
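The occurrence condition in the abstract can be illustrated with a minimal Python sketch of a straightforward per-position check. This is not the MP algorithm and not necessarily the authors' checking procedure; the function names `transpositions_at` and `find_occurrences` are hypothetical, positions are 0-based (the abstract uses 1-based indices), and all tracks are assumed to have the same length. For each pattern symbol, the sketch collects the transpositions `c = p_k - s` available on any track and intersects them across the pattern, which takes roughly O(hm) set operations per position.

```python
from typing import List, Sequence, Set, Tuple


def transpositions_at(tracks: Sequence[Sequence[int]],
                      pattern: Sequence[int],
                      j: int) -> Set[int]:
    """Return every constant c such that, for each k, some track i_k
    satisfies pattern[k] == tracks[i_k][j + k] + c (0-based position j)."""
    m = len(pattern)
    n = len(tracks[0])  # assumption: all tracks share the same length
    if m == 0 or j < 0 or j + m > n:
        return set()
    # Transpositions compatible with the first pattern symbol.
    candidates = {pattern[0] - track[j] for track in tracks}
    for k in range(1, m):
        # Keep only transpositions that some track also supports at j + k.
        candidates &= {pattern[k] - track[j + k] for track in tracks}
        if not candidates:
            break
    return candidates


def find_occurrences(tracks: Sequence[Sequence[int]],
                     pattern: Sequence[int]) -> List[Tuple[int, List[int]]]:
    """Naive scan: list (position, sorted transpositions) for every match."""
    n = len(tracks[0])
    m = len(pattern)
    result = []
    for j in range(n - m + 1):
        cs = transpositions_at(tracks, pattern, j)
        if cs:
            result.append((j, sorted(cs)))
    return result


# Two parallel tracks; the pattern hops track 1 -> track 2 -> track 1 with c = 2.
tracks = [[60, 62, 64, 65],
          [48, 55, 52, 53]]
pattern = [62, 57, 66]
print(find_occurrences(tracks, pattern))  # [(0, [2])]
```

In a filtering setting, a check of this kind would be run only at the candidate positions reported by the filter rather than at every position.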

Related articles

Restricted Transposition Invariant Approximate String Matching Under Edit Distance

Let A and B be strings with lengths m and n, respectively, over a finite integer alphabet. Two classic string matching problems are computing the edit distance between A and B, and searching for approximate occurrences of A inside B. We consider the classic Levenshtein distance, but the discussion is also applicable to indel distance. A relatively new variant [8] of string matching, motivated in...

KMP Based Pattern Matching Algorithms for Multi-Track Strings

A multi-track string is an N-tuple of strings of length n. For two multi-track strings T = (t1, t2, . . . , tN) of length n and P = (p1, p2, . . . , pM) of length m, permuted pattern matching is the problem of finding all positions i such that P is a permuted match with T[i : i+M]. We propose three new algorithms for permuted pattern matching based on the KMP algorithm. The first algorithm is an exact mat...

Position Heaps for Permuted Pattern Matching on Multi-Track String

A multi-set of N strings of length n is called a multi-track string. Permuted pattern matching is the problem that, given two multi-track strings T = {t1, . . . , tN} of length n and P = {p1, . . . , pN} of length m, outputs all positions i such that {p1, . . . , pN} = {t1[i : i+m−1], . . . , tN[i : i+m−1]}. We propose two new indexing structures for multi-track strings. One is a time-efficien...

Searching Monophonic Patterns within Polyphonic Sources

The string matching problem, in which one should find the occurrences of a pattern string within a text, is well studied in the literature. The problem can be solved efficiently, e.g., by using so-called bit-parallel algorithms. We adapt the bit-parallel approach to music information retrieval. We consider a situation where the pattern is monophonic and the text (the musical sou...

Transposition invariant string matching

Given strings A = a1a2 . . . am and B = b1b2 . . . bn over an alphabet Σ ⊆ U, where U is some numerical universe closed under addition and subtraction, and a distance function d(A, B) that gives the score of the best (partial) matching of A and B, the transposition invariant distance is $\min_{t \in U} \{d(A + t, B)\}$, where A + t = (a1 + t)(a2 + t) . . . (am + t). We study the problem of computing the transpo...

Journal:
  • Nord. J. Comput.

Volume 10

Publication date: 2003